Problems of modelling complex objects can be solved either by deductive logical-mathematical methods or by inductive sorting-out methods. Deductive methods have advantages for rather simple modelling problems where the theory of the object being modelled is known, so that a model can be developed from physically based principles using the user's knowledge of the process. Besides this informational aspect, the practical applicability of modelling techniques and tools plays a significant part in their broad and varied adoption by users. The user is normally interested in the solution of the initial problem and rarely has expert knowledge of deductive mathematical modelling. Efforts to use the known tools of artificial intelligence have failed in many cases in the past. This is because methods of artificial intelligence are based on extracting human skills in a subjective and creative domain: model building. Moreover, this approach cannot solve the significant problems of modelling complex systems, such as inadequate a priori information, a great number of unmeasurable variables, noisy and extremely short data samples, and ill-defined objects with fuzzy characteristics. In such cases, knowledge extraction from data, i.e. deriving a model from experimental measurements using inductive methods, has advantages for rather complex objects about which only little a priori knowledge exists.
One development direction that takes up these practical demands is the self-organization of mathematical models, which can be realized by means of statistical learning networks such as GMDH algorithms.
In classical GMDH algorithms the partial models have to be chosen, whether linear or nonlinear functions, in each generated layer. Lemke [1] has developed an algorithm for the generation of optimum partial models. A complete polynomial of second degree,

f(x_i, x_j) = a_0 + a_1 x_i + a_2 x_j + a_3 x_i x_j + a_4 x_i^2 + a_5 x_j^2,

is optimized using various selection criteria such as the PESS criterion. In distinction to classical algorithms, this one has the ability to synthesize linear or nonlinear models of optimal complexity depending on the object structure, with a meaningful reduction of model complexity related to the existing noise level of the data. This results in more flexible modelling in each layer, because a partial model may contain none (y = a_0), one, or both input variables of every possible combination, depending on their actual contribution. The aim is to avoid, for short and very noisy data samples, the inclusion of redundant variables which, once part of the model, cannot be excluded afterwards. So, in the end, simpler models can be expected.
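To illustrate, the following is a minimal sketch of such a layer-wise partial model optimization, assuming ordinary least squares for the coefficients and a simple holdout error in place of the PESS criterion (whose exact form is not reproduced here); all function and variable names are illustrative.

```python
import itertools
import numpy as np

def best_partial_model(X_train, y_train, X_val, y_val, i, j):
    """Search all term subsets of the complete second-degree polynomial in
    (x_i, x_j), fit each by least squares on the training part, and keep
    the subset with the smallest validation error.  A plain holdout error
    stands in here for the PESS criterion."""
    def terms(X):
        xi, xj = X[:, i], X[:, j]
        # columns: 1, x_i, x_j, x_i*x_j, x_i^2, x_j^2
        return np.column_stack([np.ones(len(X)), xi, xj, xi * xj, xi**2, xj**2])

    T_tr, T_va = terms(X_train), terms(X_val)
    best_err, best_cols, best_coef = np.inf, None, None
    for r in range(1, 7):
        for cols in itertools.combinations(range(6), r):
            if 0 not in cols:   # keep the constant a0 in every candidate
                continue
            cols = list(cols)
            coef, *_ = np.linalg.lstsq(T_tr[:, cols], y_train, rcond=None)
            err = np.mean((T_va[:, cols] @ coef - y_val) ** 2)
            if err < best_err:
                best_err, best_cols, best_coef = err, cols, coef
    return best_err, best_cols, best_coef
```

Because every candidate subset contains the constant term, the pure-constant model y = a_0 is included in the search, so a partial model with no, one, or both inputs can win, as described above.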
Successful applications of GMDH algorithms are known especially in those areas where theoretical systems analysis is not applicable because of the complexity of the object being examined, the status of knowledge in the related scientific theory, and the required time. An important area, especially for decision support systems, is the analysis and prediction of systems of characteristics. In the following we present a recent example in which the SelfOrganize! tool was used.
3.1. Solvency Checking
The basis for the examination and automatic model synthesis were sets of 19 anonymized characteristics of 81 companies, which had served a banking establishment to decide on a company's solvency. Ten decisions were chosen by the bank for checking the results, and the other 71 decisions were used as the learning set for modelling. There are several methodologies for obtaining the required models using GMDH, but in distinction to neural networks, each of them delivers assertions about the influence of the individual characteristics on the decision.
A. Model of the dependence of the decision on the variables
Linear models y_M = ∑_i a_i x_i and nonlinear static models of the decision variable were generated from the 19 characteristics x_i. The decision variable was set to +1 ("positive") or -1 ("negative") according to the decision. All obtained models extracted the variables x_5, x_8, x_10 and x_15 as significant, e.g.

y_M = -3.4528 + 0.1174 x_5 + 0.1701 x_15 - 0.551 x_8 + 1.311 x_10.

These four variables can be interpreted as the main decision variables.
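For example, the reported linear model can be evaluated directly, with the sign of y_M giving the class. This is a sketch: the decision threshold at y_M = 0 is an assumption following from the ±1 coding, and the 1-based indexing is chosen to match the paper's variable numbers.

```python
def decide(x):
    """Evaluate the reported linear decision model for one company.
    x is the vector of the 19 characteristics; x[0] is unused so that
    x[5], x[8], x[10], x[15] match the paper's 1-based indices."""
    y_m = -3.4528 + 0.1174 * x[5] + 0.1701 * x[15] - 0.551 * x[8] + 1.311 * x[10]
    return "+" if y_m >= 0 else "-"
```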
B. Modelling of independent systems of
equations
Another and more elaborate way is the generation of linear or nonlinear systems of equations separately for all positive and all negative decisions. In the case of linear models these are

x^+ = A^+ x^+ ;   x^- = A^- x^- ;   A = {a_ij}, with a_ii = 0.

Such systems grasp the spectrum of decisions better because they have a greater breadth of variation, and they can be interpreted as well. The corresponding model values x_i^+ and x_i^- are then calculated for the checking-set variables x_i^c. Membership in class + or - was decided on the basis of the deviations

Δ_i^+ = x_i^c - x_i^+   and   Δ_i^- = x_i^c - x_i^- .
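A minimal sketch of this step, assuming A^+ and A^- are the estimated coefficient matrices with zero diagonals, so that each variable is reconstructed from the remaining ones of the checking-set vector:

```python
import numpy as np

def deviations(A_plus, A_minus, x_c):
    """Model values x_i^+ = (A^+ x^c)_i and x_i^- = (A^- x^c)_i for one
    checking-set vector x^c, and the resulting deviation vectors.
    The zero diagonals ensure x_i is predicted only from the other variables."""
    x_plus = A_plus @ x_c      # predictions under the "positive" system
    x_minus = A_minus @ x_c    # predictions under the "negative" system
    return x_c - x_plus, x_c - x_minus   # Delta^+, Delta^-
```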
The results in table I have been obtained in the
following cases:
TABLE I. Classifications obtained from systems of equations

|     | c1 | c2  | c3 | c4 | c5 | c6 | c7 | c8 | c9 | c10 |
| y   | -  | +/- | +  | -  | +  | -  | -  | +  | +  | -   |
| y_M | +  | +   | +  | -  | +  | -  | -  | +  | +  | -   |
a. s^+ = ∑_i |Δ_i^+| ;  s^- = ∑_i |Δ_i^-| .
b. s^+ = ∑_{i∈N} |Δ_i^+| ;  s^- = ∑_{i∈N} |Δ_i^-| , in which N is the set of indices of those variables having influence in model A.
c. s^+ = ∑_{i∈M^+} Δ_i^+ x_i^c ;  s^- = ∑_{i∈M^-} Δ_i^- x_i^c , in which M^+ and M^- are the sets of indices of those input variables for which the best-fitting models were obtained (for positive and negative decisions, respectively).
d. A further way of decision making is to calculate for the variables x_i^c their deviations Δ_i^+ and Δ_i^- and to classify each variable on the basis of the minimum deviation. The final decision is then made as the sum of all individual classifications; a code sketch of all four rules is given below.
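A hedged sketch of the four rules a-d, building on the deviations function above. The index sets N, M^+, M^- and the comparison s^+ < s^- for rule c follow the text's pattern, but their exact handling is an assumption.

```python
import numpy as np

def classify(delta_plus, delta_minus, rule,
             x_c=None, N=None, M_plus=None, M_minus=None):
    """Decide class "+" or "-" from the deviation vectors by rule a, b, c or d.
    N, M_plus, M_minus are arrays of variable indices as defined in the text."""
    if rule == "a":      # sums of absolute deviations over all variables
        s_plus, s_minus = np.abs(delta_plus).sum(), np.abs(delta_minus).sum()
    elif rule == "b":    # restricted to the influential variables N of model A
        s_plus, s_minus = np.abs(delta_plus[N]).sum(), np.abs(delta_minus[N]).sum()
    elif rule == "c":    # deviations weighted by the checking values
        s_plus = (delta_plus[M_plus] * x_c[M_plus]).sum()
        s_minus = (delta_minus[M_minus] * x_c[M_minus]).sum()
    else:                # rule d: one vote per variable, minimum deviation wins
        votes = sum(1 if dp < dm else -1
                    for dp, dm in zip(np.abs(delta_plus), np.abs(delta_minus)))
        return "+" if votes > 0 else "-"
    return "+" if s_plus < s_minus else "-"
```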
C. Synthesis
A synthesis of different classifications makes it possible to describe the wide spectrum of possible decisions better without losing the explanation component. Table II shows a synthesis on the basis of majority decisions.
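The majority synthesis itself is straightforward; a minimal sketch:

```python
from collections import Counter

def synthesize(classifications):
    """Majority decision over the individual classifications,
    e.g. synthesize(["+", "-", "+"]) returns "+"."""
    return Counter(classifications).most_common(1)[0][0]
```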
TABLE II. Synthesis of different classifications

| checking set | target | model A | model B.b | model B.d | synthesis | value |
| c1           | -      | +       | -         | -         | -         | true  |
| c2           | +/-    | +       | +         | +         | +         | true  |
| c3           | +      | +       | +         | +         | +         | true  |
| c4           | -      | -       | -         | -         | -         | true  |
| c5           | +      | +       | +         | +         | +         | true  |
| c6           | -      | -       | -         | -         | -         | true  |
| c7           | -      | -       | -         | -         | -         | true  |
| c8           | +      | +       | +         | +         | +         | true  |
| c9           | +      | +       | +         | -         | +         | true  |
| c10          | -      | -       | -         | -         | -         | true  |